Skip to content

Conversation

rolandschulz
Copy link
Contributor

Implements #17832

@rolandschulz rolandschulz requested a review from a team as a code owner April 15, 2025 00:51
// Overload taking zero-argument function object
template <class F, int Dim>
std::enable_if_t<std::is_invocable_v<const F &>, sycl::event>
launch(const F &f, const sycl::nd_range<Dim> &range,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure what order of argument is best:

  • Advantages of lambda first (this): function name (as part of lambda) is first like in cuda and other overloads
  • Advantage of lambda last (like in https://github.com/codeplaysoftware/cutlass-fork/pull/305/files): consistent with SYCL parallel_for and sycl_khr_free_function_commands. Function arguments (as part of lambda) are last like in cuda and other overloads.

}
// Alternative launch through dim3 objects
template <class F>
std::enable_if_t<std::is_invocable_v<const F &>, sycl::event>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that this is unchanged, but I think you will want an overload of these launch functions that do not return an event, and that use the new enqueue_functions API (nd_launch): see https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_enqueue_functions.asciidoc
This is much closer to native cuda and does lead to a very noticeable performance improvement for small kernels (relevant to cutlass), because the overhead of creating events in cuda/hip is big.

I do not know whether having the function calling nd_launch in the implementation might change other design decisions here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we want a version which doesn't return an event and calls nd_launch. I don't think internally calling nd_launch will change the design decision here.

An overload isn't a good way to add a version which doesn't return an event. Given that one can't overload by return type, it would need to be a property or something like it to opt-out of the event. But getting an event back should be opt-in nor opt-out.

It would be best if we could just remove the return value. But I assume we don't want to break API.

If so I think we need to add a new function like launch2 or launch_fast or something similar which doesn't return an event.

@sarnex sarnex requested a review from a team as a code owner October 1, 2025 20:42
@sarnex sarnex requested a review from aelovikov-intel October 1, 2025 20:42
@aelovikov-intel
Copy link
Contributor

I think this should simply be closed. From the linked GH issue:

syclcompat is going to be deprecated and removed (see #19861) - as such I will close the issue as won't fix.

@rolandschulz , any objections to closing this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants